The LIA summarization system at DUC-2007

Authors

  • Florian Boudin
  • Frederic Bechet
  • Marc El-Bèze
  • Benoit Favre
  • Laurent Gillard
  • Juan-Manuel Torres-Moreno
Abstract

This paper presents the LIA summarization systems that participated in DUC 2007. This is the second participation of the LIA at DUC, and we discuss our systems for both the main and update tasks. The system proposed for the main task is a combination of seven different sentence selection systems. The system outputs are fused in a weighted graph whose cost functions integrate the votes of each system. The final summary corresponds to the best path in this graph. Our experiments corroborate the results we obtained at DUC 2006: the fusion of the multiple systems always outperforms the best system alone. The update task introduces a new kind of summarization, update summarization over time. We propose a cosine maximization-minimization approach. Our system relies on two main concepts. The first is cross-summary redundancy removal, which attempts to limit the redundancy between the update summary and the previous ones. The second is novelty detection in a cluster of documents. In the DUC 2007 main and update evaluations, our systems obtained very good results in both automatic and human evaluations.
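
The abstract names a cosine maximization-minimization approach without giving its formula. The following is only a minimal sketch of the general idea, not the authors' scoring function: it assumes bag-of-words vectors, rewards similarity to the topic, and penalizes similarity to earlier summaries. The names `bow`, `cosine`, and `max_min_score`, and the product used to combine the two terms, are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words term counts (assumed representation, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def max_min_score(sentence, topic, previous_summaries):
    """Hypothetical score: high topic similarity, low overlap with past summaries."""
    s = bow(sentence)
    relevance = cosine(s, bow(topic))  # term to maximize
    redundancy = max(
        (cosine(s, bow(p)) for p in previous_summaries), default=0.0
    )  # term to minimize
    # Assumed combination: relevance discounted by redundancy.
    return relevance * (1.0 - redundancy)
```

Under these assumptions, a sentence that repeats an earlier summary scores close to zero, while a sentence that is on topic and shares little with earlier summaries scores higher.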

Similar resources

The LIA Update Summarization Systems at TAC 2008 (DRAFT)

For the third participation of the LIA to the DUC–TAC conferences, two summarizers were developed. The first is based on the SMMR sentence scoring algorithm described in (Boudin et al., 2008). The second summarizer is a fusion between two sentence scoring methods: SMMR and a variable length insertion gap n-term model (Favre et al., 2006; Boudin et al., 2007). We compare our two summarizers usin...

The LIA-Thales summarization system at DUC-2006

The LIA-Thales system is made of five different sentence selection systems and a fusion module. Among the five sentence selection systems used, two were originally developed for the Question-Answering task (QA) and three specifically built for DUC-2006. The outputs of the five systems are combined in a weighted graph where the cost functions integrate the votes given by the different systems to...

ICT CAS at DUC 2007

This paper presents our multi-document summarization system ICTGSP-S at DUC 2007. We propose a new method for representing and summarizing documents by integrating subtopics partition with graph representation. The method starts from the assumption that capturing subtopic structure of document collection is essential for summarization. The evaluation results show the benefit of this approach.

IIIT Hyderabad at DUC 2007

In this paper we report our performance at the DUC 2007 summarization tasks. We participated both in the query-focused multi-document summarization main task and in the pilot update summary generation task. This year we used a term clustering approach to better estimate a sentence prior. We used only the sentence prior, which is query independent, in the update summarization task and found that it’s p...

A Term Frequency Distribution Approach for the DUC-2007 Update Task

We present our system used in the DUC 2007 update task, which is our first entry in any of the DUC evaluations. We make use of ideas within our existing FreqDistSumm text summarizer, which has been shown to perform well in biomedical text summarization. Our system submitted to the DUC Update Task, called FreqDistUpdate, uses a context sensitive approach to scoring sentences based on a frequency...

Publication date: 2007